Exploratory Data Analysis of Starbucks’ and Dunkin Donuts’ Nutritional Information

Final Project
Data Science 1 with R (STAT 301-1)

Author

Rohan Krishnamurthi

Published

December 8, 2023

Introduction

In this report, I am inquiring about the nutritional information of Starbucks’ and Dunkin Donuts’ food and drink items. The first dataset I am analyzing contains relevant nutritional information of all beverages available at Starbucks. The second dataset concerns all of the food items available at Starbucks. The third dataset concerns the nutritional information of all products, both food and drink items, available at Dunkin Donuts.

Motivation for research

I chose these datasets because I normally enjoy both Starbucks’ and Dunkin Donuts’ offerings. However, it must be noted that many students and I are currently boycotting Starbucks due to the company’s decision not to support the Palestinian people given the humanitarian crisis they are currently experiencing. While I enjoy their products, I currently will not support the company due to this issue.

Nevertheless, I have had plenty of items from both their drink and food menus and I feel knowledgeable enough to work with data concerning both. Also, I really enjoy making and trying intricate coffee and tea drinks, so I’d be happy to work with data regarding Starbucks and Dunkin Donuts. I decided to include the food information as well as a challenge to work with multiple sets of data. Working with data from both Starbucks and Dunkin Donuts allows more depth in the exploration of nutrition information.

Research questions

The overarching research question for this EDA is what one can consume at either Starbucks or Dunkin Donuts in order to maintain a healthy, nutritious diet. There are many ways to address this question. Some initial queries are how calories, fat, protein, and sugar vary by drink type or by food item at either retailer. To involve both food and drink items, I propose investigating which combination of food and drink items is most nutritious at either Starbucks or Dunkin Donuts. A related question is which drink or food items are least nutritious and should generally be avoided. Lastly, I propose asking how nutritious food and drinks vary between Starbucks and Dunkin Donuts, and which restaurant has the most nutritious options overall.

Data Overview and Quality

The three raw datasets were copied into new datasets, which were then cleaned and prepared for analysis. The first raw dataset, entitled “starbucks.csv”, contains 18 variables and 242 observations, corresponding to 242 different drinks. Three of the variables contain categorical data, corresponding to drink identifiers, and the remaining 15 variables contain numerical data, corresponding to nutritional information. Some variables (specifically, caffeine) have missing values, which are recorded as NA instead of the actual numerical values.

The second raw dataset, stored as “starbucks_menu_nutrition_food_redo.csv”, contains 6 variables and 113 observations, corresponding to 113 different food items. One variable contains categorical data, corresponding to the food name. The remaining five variables contain numerical data, corresponding to the nutritional information, such as calories, fat, carbohydrates, fiber, and protein content. There are no missing values in this data set.

The third raw dataset, stored as “dunkindonutsnutrition.csv”, contains 13 variables and 790 observations, corresponding to 780 different products sold by Dunkin Donuts. All variables are stored as character vectors. Two variables concern the item category and name, while the remaining variables concern nutritional information. There are no missing values in this data set either.

Starbucks data cleaning and preparation

In the 0a_starbucks_data_preparation R script, the dataset “starbucks_menu_nutrition_food_redo.csv” was read in, copied as starbucks_food_data, and then cleaned. The first issue with the raw data was that the variable names had spaces, which could make it difficult to select these variables. The spaces were replaced with underscores using the names() function. Also, the variable names were made lowercase for consistency.

The dataset “starbucks.csv” was read in, copied as starbucks_drinks_data, and then cleaned as well. The data frame had the same issue in that the column names had uppercase letters and spaces. This was addressed using the names() function as well. Additionally, in the column names “Total Fat g”, “Dietary Fiber g”, and “Total Carbohydrates g”, the initial word was removed so these variables would have names identical to those in the starbucks_food_data dataset. This facilitated joining observations in both data frames.
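The name cleaning described above can be sketched in base R as follows. This is a minimal illustration rather than the code in the preparation script: the toy data frame, its values, and the helper name clean_names are assumptions made for the example.

```r
# Toy data frame with column names styled like the raw Starbucks drinks file.
# The values are made up for illustration.
raw_drinks <- data.frame(
  "Beverage Category" = c("Coffee", "Smoothies"),
  "Total Fat g" = c(0.1, 2.0),
  "Dietary Fiber g" = c(0, 6),
  check.names = FALSE
)

# Hypothetical helper: lowercase the names and replace spaces with underscores.
clean_names <- function(df) {
  new_names <- tolower(names(df))
  new_names <- gsub(" ", "_", new_names, fixed = TRUE)
  names(df) <- new_names
  df
}

starbucks_drinks_demo <- clean_names(raw_drinks)

# Drop the leading "total_" or "dietary_" so the names line up with the food data.
names(starbucks_drinks_demo) <- sub("^(total|dietary)_", "", names(starbucks_drinks_demo))

print(names(starbucks_drinks_demo))
```

Keeping the renaming in one small function means the same steps can be applied to each raw file before the data frames are combined.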

In the beverage_prep column of starbucks_drinks_data, the size of the beverage was missing in some observations but could be deduced from the size listed for the previous (or second previous) beverage. The size was added to this column using a for loop to iterate through each observation and if statements to identify the size and then add it to the observation.

Additionally, the information in the beverage_prep column in starbucks_drinks_data was used to create two new variables: one for milk type and one for size. First, the column milk_type was developed by extracting the type of milk from the beverage_prep column. Drinks without any milk intentionally had NA in this column. A for loop was used to iterate through each observation in the beverage_prep column, and if statements were used to identify the type of milk and add it to the milk_type column for each observation. The column size was added by extracting the size of the drink from the beverage_prep column. A for loop was used to iterate through each observation in the beverage_prep column, and if statements were used to identify the beverage size and add it to the size column for each observation.
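The loop-based extraction described above might look roughly like the sketch below. The beverage_prep values, the size and milk labels, and the carry-forward rule are assumptions for illustration; the actual script may differ in its details.

```r
# Example beverage_prep values (assumed for illustration).
beverage_prep <- c("Short Nonfat Milk", "2% Milk", "Soymilk",
                   "Tall Nonfat Milk", "2% Milk", "Grande")

sizes <- c("Short", "Tall", "Grande", "Venti")
milks <- c("Nonfat Milk", "2% Milk", "Soymilk", "Whole Milk")

size <- rep(NA_character_, length(beverage_prep))
milk_type <- rep(NA_character_, length(beverage_prep))
last_size <- NA_character_

for (i in seq_along(beverage_prep)) {
  # If this row names a size, remember it; otherwise reuse the last size seen.
  for (s in sizes) {
    if (grepl(s, beverage_prep[i], fixed = TRUE)) last_size <- s
  }
  size[i] <- last_size

  # Record the milk type if one is named; rows without milk stay NA.
  for (m in milks) {
    if (grepl(m, beverage_prep[i], fixed = TRUE)) milk_type[i] <- m
  }
}

prep_demo <- data.frame(beverage_prep, size, milk_type)
print(prep_demo)
```

In this toy input, the second and third rows inherit "Short" from the first row, and the last row has a size but no milk type.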

Dunkin Donuts data cleaning and preparation

In the 0b_dunkin_data_preparation R script, the dataset “dunkindonutsnutrition.csv” was read in, copied as dunkin_donuts_data, and then cleaned. First, the variable names were changed with the names() function to remove parentheses and spaces. Additionally, the names were made lowercase for consistency.

A new variable, item_type, was created as a factor. The level “drink” was assigned to drinks (based on the category variable) and the level “food” was assigned to food items. Variables corresponding to nutritional information were converted into numeric vectors, as they generally contained numeric values. Moreover, a new factor variable, size, was created to identify the size of each product. The size was extracted from the item name using the grepl() function. Similarly, a new factor variable, milk_type, was created to identify the type of milk in each product, if applicable. Items without a size or milk type were assigned the level “not applicable” for that variable.
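A rough base R sketch of these steps is given below. The rows, the list of food categories, and the size and milk labels are illustrative assumptions, not values taken from the real file or the actual script.

```r
# Toy rows in the spirit of the Dunkin Donuts data (illustrative values only).
dunkin_demo <- data.frame(
  category = c("Iced Latte", "Donuts", "Hot Coffee"),
  item = c("Iced Latte with Skim Milk, Medium", "Glazed Donut",
           "Hot Coffee with Cream, Large"),
  calories = c("100", "240", "120"),
  stringsAsFactors = FALSE
)

# Nutrition columns arrive as character vectors, so convert them to numeric.
dunkin_demo$calories <- as.numeric(dunkin_demo$calories)

# Assumed rule: categories in this list are food, everything else is a drink.
food_categories <- c("Donuts", "Muffins", "Sandwiches")
dunkin_demo$item_type <- factor(
  ifelse(dunkin_demo$category %in% food_categories, "food", "drink"),
  levels = c("drink", "food")
)

# Extract the size from the item name with grepl(); default is "not applicable".
size <- rep("not applicable", nrow(dunkin_demo))
for (s in c("Small", "Medium", "Large")) {
  size[grepl(s, dunkin_demo$item, fixed = TRUE)] <- s
}
dunkin_demo$size <- factor(size, levels = c("Small", "Medium", "Large", "not applicable"))

# Same idea for milk type, storing a short lowercase label.
milk <- rep("not applicable", nrow(dunkin_demo))
for (m in c("Skim Milk", "Whole Milk", "Cream")) {
  milk[grepl(m, dunkin_demo$item, fixed = TRUE)] <- tolower(sub(" Milk", "", m))
}
dunkin_demo$milk_type <- factor(milk)

print(dunkin_demo)
```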

Finally, the data was filtered into two new datasets: dunkin_donuts_drinks_data, containing the drink observations; and dunkin_donuts_food_data, containing the food observations.

From the R scripts, the cleaned datasets were saved as CSV files using the write_csv() function, and then they were read back in for analysis using the read_csv() function.

Additional data preparation

Finally, new data frames entitled all_drinks_data and all_food_data were created by joining the two existing food or drink data frames with the merge() function. Columns were renamed in certain instances for consistency across both datasets to be combined. Additionally, a new variable retailer was added to each observation to distinguish Starbucks from Dunkin Donuts products. These data frames were made to facilitate comparison of Starbucks and Dunkin Donuts products.
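As a minimal sketch of this combination step, the example below tags two toy food data frames with a retailer column and then stacks them with merge(). The items and calorie values are made up for illustration, and the real data frames have more columns.

```r
# Toy data frames with matching column names (values are illustrative).
starbucks_food_demo <- data.frame(
  item = c("Plain Bagel", "Seasonal Fruit Blend"),
  calories = c(280, 90)
)
dunkin_food_demo <- data.frame(
  item = c("Hash Browns", "Glazed Donut"),
  calories = c(130, 240)
)

# Tag each row with its retailer before combining.
starbucks_food_demo$retailer <- "Starbucks"
dunkin_food_demo$retailer <- "Dunkin Donuts"

# With all = TRUE and every column shared, merge() keeps all rows from both sides.
all_food_demo <- merge(starbucks_food_demo, dunkin_food_demo, all = TRUE)

print(all_food_demo)
```

Because the retailer values never match across the two inputs, no rows are collapsed, and the result can be grouped by retailer for later comparisons.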

Explorations

Analysis of Drinks

Nutritional Content by Drink Category

Sugar, Fat, and Calories Content

The first question I sought to answer was which drinks are considered the healthiest. I initially approached this question by comparing the fat, sugar, and calorie content of drinks. In these categories, a beverage with a lower amount of each of these nutrients is considered healthier. For this analysis, I compared drink categories, rather than the individual drinks themselves. This was done because there were far fewer drink categories, which made it easier to make generalizations about drinks.
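Category-level averages of this kind can be computed with aggregate() in base R. The sketch below uses a small made-up data frame; the column names are assumed to follow the cleaned naming style described earlier.

```r
# Toy drink-level data (illustrative values, two drinks per category).
drinks_demo <- data.frame(
  beverage_category = c("Coffee", "Coffee", "Smoothies", "Smoothies"),
  fat_g = c(0, 0.2, 1.5, 2.5),
  sugar_g = c(0, 0, 34, 40),
  calories = c(3, 5, 270, 290)
)

# Average each nutrient within each drink category.
category_means <- aggregate(
  cbind(fat_g, sugar_g, calories) ~ beverage_category,
  data = drinks_demo,
  FUN = mean
)

print(category_means)
```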

Starbucks

The following table and three figures depict the fat, sugar, and calorie content of types of drinks sold by Starbucks.

Starbucks Drinks Nutritional Info
Drink Category Average fat content (g) Average sugar content (g) Average calories
Classic Espresso Drinks 3 17 140
Coffee 0 0 4
Frappuccino Blended Coffee 3 57 277
Frappuccino Blended Creme 2 48 233
Frappuccino Light Blended Coffee 1 32 162
Shaken Iced Beverages 0 26 114
Signature Espresso Drinks 5 39 250
Smoothies 2 37 282
Tazo Tea Drinks 3 30 177

The data above shows that plain coffee has the lowest fat, sugar, and calorie content. This is expected because it has no additives. Aside from coffee, classic espresso drinks had a relatively low content of fat, sugar, and calories, which in turn would make them the next healthiest option at Starbucks.

Dunkin Donuts

The following table and three figures depict the fat, sugar, and calorie content of types of drinks sold by Dunkin Donuts.

Dunkin Donuts Drinks Nutritional Info
Drink Category Average fat content (g) Average sugar content (g) Average calories
Cold Brew Coffee 5 8 80
Coolatta 2 96 427
Dunkin Refreshers 5 60 320
Frozen Coffee 13 118 640
Hot Americano 0 0 8
Hot Cappuccino 3 39 223
Hot Chocolate 13 48 368
Hot Coffee 4 27 162
Hot Latte 6 43 274
Hot Macchiato 4 39 233
Iced Americano 0 0 8
Iced Cappuccino 3 39 223
Iced Coffee 3 24 142
Iced Latte 6 43 273
Iced Macchiato 3 39 226
Iced Tea 0 25 109

From the data above, it is clear that Hot and Iced Americano drinks had the lowest content of fat, sugar, and calories, making them the healthiest options offered at Dunkin Donuts. Aside from Americanos, Iced Tea had a low content of fat, while Cold Brew Coffee had a relatively low content of sugar and calories.

Protein, Calcium, and Vitamin A and C Content

I then addressed this question by investigating which beverages at each retailer had the highest amount of protein, calcium, and vitamins. In these categories, a beverage with a high content of each of these nutrients is considered healthier.

Starbucks

The Starbucks drinks data contains information regarding each drink’s protein, calcium, and vitamin A and C content. The content of each of these nutrients is compared in the following table and graphs. A boxplot was used to compare protein content, similar to the other nutrients. However, frequency polygons were used to compare calcium, vitamin A, and vitamin C content due to a lack of data points other than 0%.

Starbucks Drinks Nutritional Info, continued
Drink Category Average protein content (g) Average calcium (%DV) Average vitamin A (%DV) Average vitamin C (%DV)
Classic Espresso Drinks 9 0 0 0
Coffee 1 0 0 0
Frappuccino Blended Coffee 4 0 0 0
Frappuccino Blended Creme 4 0 0 0
Frappuccino Light Blended Coffee 4 0 0 0
Shaken Iced Beverages 1 0 0 0
Signature Espresso Drinks 10 0 0 0
Smoothies 17 0 0 1
Tazo Tea Drinks 7 0 0 0

From the data above, it is clear that smoothies had the most of these nutrients overall among Starbucks drinks. Smoothies had the highest protein, vitamin A, and vitamin C content. This was as expected, since smoothies are known to contain many nutritious fruits and even vegetables. Similarly, Signature Espresso Drinks and Tazo Tea Drinks had a relatively high content of calcium.

Dunkin Donuts

The Dunkin Donuts drinks data only contains information regarding each drink’s protein content, and not of vitamins or minerals. The protein content of each type of drink sold at Dunkin is compared in the following table and boxplot.

Dunkin Donuts Drinks Nutritional Info, continued
Drink Category Average protein content (g)
Cold Brew Coffee 1
Coolatta 2
Dunkin Refreshers 3
Frozen Coffee 7
Hot Americano 0
Hot Cappuccino 7
Hot Chocolate 3
Hot Coffee 3
Hot Latte 10
Hot Macchiato 8
Iced Americano 0
Iced Cappuccino 7
Iced Coffee 3
Iced Latte 10
Iced Macchiato 7
Iced Tea 0

The data indicate that iced and hot lattes had the highest protein content of all Dunkin Donuts drinks. This is as expected, as lattes are generally made with protein-rich milk. Nonetheless, the data do not indicate much about the concentration of other critical nutrients.

Nutritional Content by Milk Type

Another question I sought to answer is how the nutrition content of each drink varies by milk type. When people order drinks from coffee shops, there are many different ways they can customize their drink to their liking. Thus, I found it important to analyze nutrition by milk preference. For this analysis, I measured the fat, sugar, calories, and then protein of drinks having grouped them by milk type. I created boxplots to depict how these nutrients vary by each type of milk.

Starbucks

The following table and boxplots depict the average fat, sugar, calories, and protein for drinks of each type of milk at Starbucks. Observations with no milk were excluded from this analysis.

Starbucks Drinks Average Nutritional Info by Type of Milk
Milk Type Fat (g) Sugar content (g) Calories Protein content (g)
2% Milk 6 31 218 10
Nonfat Milk 1 36 190 8
Soymilk 4 32 207 7
Whole Milk 5 56 284 4
NA 0 17 75 0

As expected, nonfat milk drinks had the lowest average fat content. Drinks made with 2% milk, nonfat milk, and soymilk had roughly equivalent amounts of sugar (31 to 36 g) and calories (190 to 218). For these drinks, the type of milk appears to have little impact on sugar and calories; the drink itself has a greater impact on these nutrients than the type of milk. Whole milk drinks were the exception, averaging more sugar (56 g) and calories (284). Lastly, 2% milk drinks had the highest protein. In total, the type of milk one should order at Starbucks depends highly on which nutrients one seeks to maximize or minimize. There is no “one-size-fits-all” milk to choose.

Dunkin Donuts

The following table and boxplots report the average fat, sugar, calories, and protein for drinks of each type of milk at Dunkin Donuts. Observations with no milk were excluded from this analysis.

Dunkin Donuts Drinks Average Nutritional Info by Type of Milk
Milk Type Fat (g) Sugar content (g) Calories Protein content (g)
cream 13 48 332 4
skim 1 44 229 8
whole 7 44 283 8

Due to large overlap of boxes in the boxplots, as well as an abundance of outliers, I added density plots to better distinguish the nutritional information of each type of milk.

From the Dunkin Donuts milk data, it is clear that skim (nonfat) milk drinks had the lowest amount of fat. Sugar content was roughly equal across skim milk, whole milk, and cream drinks (44 to 48 g on average), while average calories ranged from 229 (skim) to 332 (cream). These results are similar to the Starbucks data. Lastly, drinks made with skim and whole milk were highest in protein.

Overall

Upon reviewing both the Starbucks and Dunkin Donuts milk data, one should order skim milk drinks in order to minimize fat content. Milk choice has little to no effect on intake of sugar and calories. Ordering milk with some degree of fat provides the most protein.

New metrics to assess healthiness of drinks

In the previous section, I analyzed which types of beverage had high (or low) contents of specific nutrients. There are more nuanced ways to assess the “healthiness” of a drink. In this section, I aim to incorporate multiple variables to develop new ways to analyze healthiness.

Beverages rich in protein, low in calories

First, I sought to determine which types of beverages were richest in protein while lowest in unhealthy quantities, such as calories and fat. I compared the ratio of protein to calories for each drink type. Protein content could also be compared to sugar, fat, or other quantities to produce different metrics.

Starbucks

In the following table and scatterplot, I compare the ratio of grams of protein to calories for types of drinks at Starbucks. Due to difficulty in distinguishing data points in the scatterplot, I added faceted scatterplots as well.

Starbucks Protein to Calories
Drink Category Protein content (g) Calories Ratio of protein (g) to calories
Classic Espresso Drinks 9 140 0.0643
Coffee 1 4 0.2500
Frappuccino Blended Coffee 4 277 0.0144
Frappuccino Blended Creme 4 233 0.0172
Frappuccino Light Blended Coffee 4 162 0.0247
Shaken Iced Beverages 1 114 0.0088
Signature Espresso Drinks 10 250 0.0400
Smoothies 17 282 0.0603
Tazo Tea Drinks 7 177 0.0395

Among Starbucks drinks, coffee alone had the highest ratio of protein to calories. This is rather misleading, because coffee is very low in calories to begin with (about 4 calories and 1 g of protein on average), so even a small amount of protein produces a large ratio. This caveat indicates that more methods are needed to assess the most nutritious drink types overall. Aside from coffee, Classic Espresso Drinks and then Smoothies had the next highest ratios of protein to calories.
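The ratios in the Starbucks table above can be reproduced from the rounded category averages it reports. The sketch below does this for three of the categories; the data frame and column names are chosen for the example.

```r
# Rounded category averages taken from the Starbucks protein-to-calories table.
protein_demo <- data.frame(
  category = c("Classic Espresso Drinks", "Coffee", "Smoothies"),
  protein_g = c(9, 1, 17),
  calories = c(140, 4, 282)
)

# Grams of protein per calorie, rounded to four decimals as in the table.
protein_demo$protein_per_calorie <- round(protein_demo$protein_g / protein_demo$calories, 4)

# Sort from highest to lowest ratio.
protein_demo <- protein_demo[order(-protein_demo$protein_per_calorie), ]

print(protein_demo)
```

Coffee comes out on top only because its denominator is so small, which is the caveat noted above.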

Dunkin Donuts

In the following table and scatterplot, I compared the same quantities for drinks at Dunkin Donuts. A single scatterplot and faceted scatterplots were included as well.

Protein to Calories, Dunkin
Drink Category Protein content (g) Calories Ratio of protein (g) to calories
Cold Brew Coffee 1 80 0.0125
Coolatta 2 427 0.0047
Dunkin Refreshers 3 320 0.0094
Frozen Coffee 7 640 0.0109
Hot Americano 0 8 0.0000
Hot Cappuccino 7 223 0.0314
Hot Chocolate 3 368 0.0082
Hot Coffee 3 162 0.0185
Hot Latte 10 274 0.0365
Hot Macchiato 8 233 0.0343
Iced Americano 0 8 0.0000
Iced Cappuccino 7 223 0.0314
Iced Coffee 3 142 0.0211
Iced Latte 10 273 0.0366
Iced Macchiato 7 226 0.0310
Iced Tea 0 109 0.0000

The data indicate that hot and iced lattes have the highest ratio of protein to calories among drinks at Dunkin Donuts. This is consistent with earlier results, which found latte drinks to have high protein contents and be relatively low calorie.

Beverages rich in vitamins

Another way I sought to assess how healthy beverages were was by assessing their overall vitamin and mineral contents. Unfortunately, this information was not available for the Dunkin Donuts drinks data. The Starbucks drinks data contains information regarding vitamin A, vitamin C, calcium, and iron. For this analysis, I took the sum of the listed percent daily value of each of these nutrients and divided by four. This calculation gives a rough estimate of the share of one’s daily vitamin and mineral needs met by each drink type (i.e., how much of your daily vitamin and mineral intake you are getting through each drink), with the four nutrients weighted equally. The following table contains the average vitamin/mineral percent daily value for each drink type at Starbucks.

Average Percent Daily Value of Vitamins and Minerals, Starbucks
Drink Category Vitamin A (%DV) Vitamin C (%DV) Calcium (%DV) Iron (%DV) Average %DV of Vitamins and Minerals
Classic Espresso Drinks 0.1272414 0.0020690 0.2734483 0.0810345 0.1209483
Coffee 0.0000000 0.0000000 0.0050000 0.0000000 0.0012500
Frappuccino Blended Coffee 0.0550000 0.0000000 0.1227778 0.1086111 0.0715972
Frappuccino Blended Creme 0.0615385 0.0461538 0.1384615 0.0384615 0.0711538
Frappuccino Light Blended Coffee 0.0600000 0.0000000 0.1133333 0.0900000 0.0658333
Shaken Iced Beverages 0.0177778 0.0250000 0.0400000 0.0066667 0.0223611
Signature Espresso Drinks 0.1315000 0.0075000 0.3137500 0.1055000 0.1395625
Smoothies 0.2044444 0.6500000 0.1333333 0.1177778 0.2763889
Tazo Tea Drinks 0.1055769 0.0290385 0.2365385 0.0467308 0.1044712

The data is largely inconclusive, given how low the values are and how difficult it is to differentiate them. Nevertheless, it was found that smoothies on average have the highest proportion of the aforementioned vitamins and minerals. This is consistent with earlier findings that supported smoothies as a nutrient-rich drink. Given the data, I would recommend smoothies as the best option for Starbucks consumers looking to maximize their intake of vitamins and minerals.
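The averaging step (sum of the four nutrient values divided by four) can be checked against two rows of the table above. The sketch below uses the tabulated values for Coffee and Smoothies; the column names are assumptions for the example.

```r
# Values copied from the Coffee and Smoothies rows of the table above.
dv_demo <- data.frame(
  category = c("Coffee", "Smoothies"),
  vitamin_a = c(0.0000000, 0.2044444),
  vitamin_c = c(0.0000000, 0.6500000),
  calcium = c(0.0050000, 0.1333333),
  iron = c(0.0000000, 0.1177778)
)

nutrient_cols <- c("vitamin_a", "vitamin_c", "calcium", "iron")

# Sum the four values in each row and divide by the number of nutrients.
dv_demo$avg_dv <- rowSums(dv_demo[, nutrient_cols]) / length(nutrient_cols)

print(dv_demo[, c("category", "avg_dv")])
```

The results agree with the last column of the table (0.0012500 for Coffee and 0.2763889 for Smoothies, up to rounding).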

Analysis of food data

Similar to the drinks, I then sought to investigate which food items offered are considered the healthiest. The data regarding the beverages offered at Starbucks and Dunkin Donuts was largely similar and contained the same nutritional information. However, the two food datasets, starbucks_food_data and dunkin_donuts_food_data, have far fewer similarities. The Starbucks food data contains less nutrient information and does not categorize the food items. The Dunkin food data is more detailed and labels the food items at Dunkin Donuts by type. Thus, for this section of the EDA, the analysis was split between starbucks_food_data and dunkin_donuts_food_data.

Nutritional information of Starbucks food items

Univariate Analysis

First, I identified the healthiest Starbucks food items in five separate categories: highest in protein, highest in fiber, lowest in fat, lowest in carbohydrates, and lowest in calories. The top five food items in each category are listed in the tables below.

Highest Protein Foods, Starbucks
Food Item Protein (g)
Turkey Pesto Panini 34
Roasted Turkey & Dill Havarti Sandwich 32
Turkey & Havarti Sandwich 29
Za'atar Chicken & Lemon Tahini Salad 27
Chicken & Quinoa Protein Bowl with Black Beans and Greens 27

Highest Fiber Foods, Starbucks
Food Item Fiber (g)
Lentils & Vegetable Protein Bowl with Brown Rice 21
Za'atar Chicken & Lemon Tahini Salad 11
Strawberries & Jam Sandwich 10
Green Goddess Avocado Salad 10
Chicken & Quinoa Protein Bowl with Black Beans and Greens 9

Lowest Fat Foods, Starbucks
Food Item Fat (g)
Seasonal Fruit Blend 0.0
Cinnamon Raisin Bagel 1.0
Plain Bagel 1.5
Classic Whole-Grain Oatmeal 2.5
Hearty Blueberry Oatmeal 2.5

Lowest Carbohydrate Foods, Starbucks
Food Item Carbohydrates (g)
Organic Avocado (Spread) 5
Justin's Classic Almond Butter 6
Cauliflower Tabbouleh Side Salad 7
Garden Greens & Shaved Parmesan Side Salad 9
Sous Vide Egg Bites: Bacon & Gruyere 9

Lowest Calorie Foods, Starbucks
Food Item Calories
Frappuccino Cookie Straw 90
Organic Avocado (Spread) 90
Seasonal Fruit Blend 90
Everybody's Favorite - Bantam Bagel (2 Pack) 100
Petite Vanilla Bean Scone 120

The results from the tables above can help consumers in selecting a food item if they are looking for a specific nutrient (or lack of a nutrient) in their diet. However, the lack of categorization of this data makes it difficult for one to make generalizations about healthy, nutritious food options at Starbucks.
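Rankings like those in the tables above can be produced by sorting on one column and keeping the first few rows. The sketch below uses six items whose protein values appear in this report; the data frame name and the choice of three rows are for illustration only.

```r
# Items and protein values that appear in the tables of this report.
starbucks_protein_demo <- data.frame(
  item = c("Turkey Pesto Panini", "Seasonal Fruit Blend",
           "Roasted Turkey & Dill Havarti Sandwich", "Berry Trio Yogurt",
           "Multigrain Bagel", "Turkey & Havarti Sandwich"),
  protein_g = c(34, 1, 32, 14, 17, 29)
)

# Sort by protein from highest to lowest and keep the top three rows.
sorted_demo <- starbucks_protein_demo[order(starbucks_protein_demo$protein_g, decreasing = TRUE), ]
top_protein <- head(sorted_demo, n = 3)

print(top_protein)
```

For the "lowest" tables, the same pattern applies with decreasing = FALSE.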

Multivariate Analysis

Nonetheless, I also wanted to assess the healthiness of Starbucks food items while integrating multiple variables. Thus, for each food item, I multiplied the grams of protein by the grams of fiber, and then divided by the grams of carbohydrates and the grams of fat. This metric, which matches the ratios tabulated below, uses four of the five variables (all but calories); other metrics could use different combinations of these variables.

Health Ratio of Starbucks Food Items
Food Item Protein (g) Fiber (g) Fat (g) Carbohydrates (g) Healthiness Ratio
Seasonal Fruit Blend 1 4 0 24 Inf
Berry Trio Yogurt 14 3 2 39 0.5385
Multigrain Bagel 17 8 4 64 0.5312
Cinnamon Raisin Bagel 9 3 1 58 0.4655
Certified Gluten-Free Breakfast Sandwich 18 6 13 18 0.4615
Classic Whole-Grain Oatmeal 5 4 2 28 0.3571
Spinach Feta & Cage Free Egg White Breakfast Wrap 19 6 10 33 0.3455
Chicken & Quinoa Protein Bowl with Black Beans and Greens 27 9 17 42 0.3403
Fresh Blueberries and Honey Greek Yogurt Parfait 14 2 2 42 0.3333
Hearty Blueberry Oatmeal 5 5 2 43 0.2907

Upon comparing the ratios, the seasonal fruit blend appears to be the healthiest according to this metric. However, the seasonal fruit blend has 0 grams of fat, which puts a zero in the denominator and gives it a ratio of infinity. This makes the metric misleading and difficult to compare across items. The next healthiest option is the berry trio yogurt. Overall, this caveat indicates that more nuanced metrics are needed to compare the healthiness of food options at Starbucks.
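The tabulated ratios are consistent with protein times fiber divided by fat times carbohydrates, and the sketch below recomputes three of them under that assumption. The function name health_ratio is hypothetical, and setting aside infinite ratios before ranking is one possible way to handle the zero-fat case, not something done in the analysis above.

```r
# Hypothetical helper: (protein * fiber) / (fat * carbohydrates).
health_ratio <- function(protein_g, fiber_g, fat_g, carb_g) {
  (protein_g * fiber_g) / (fat_g * carb_g)
}

# Three rows copied from the table above.
sb_ratio_demo <- data.frame(
  item = c("Seasonal Fruit Blend", "Berry Trio Yogurt", "Multigrain Bagel"),
  protein_g = c(1, 14, 17),
  fiber_g = c(4, 3, 8),
  fat_g = c(0, 2, 4),
  carb_g = c(24, 39, 64)
)

sb_ratio_demo$ratio <- health_ratio(
  sb_ratio_demo$protein_g, sb_ratio_demo$fiber_g,
  sb_ratio_demo$fat_g, sb_ratio_demo$carb_g
)

# A zero in the denominator yields Inf in R; flag those rows explicitly.
sb_ratio_demo$zero_denominator <- is.infinite(sb_ratio_demo$ratio)

# One option: rank only the items with a finite ratio.
finite_demo <- sb_ratio_demo[is.finite(sb_ratio_demo$ratio), ]
ranked_finite <- finite_demo[order(finite_demo$ratio, decreasing = TRUE), ]

print(sb_ratio_demo)
print(ranked_finite)
```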

Nutritional Information of Dunkin Donuts Food Items

Univariate Analysis

I first sought to analyze which categories of food items from Dunkin Donuts are the healthiest. The following table records the average protein, fiber, carbohydrate, fat and calorie content in each food category. The following frequency polygons illustrate the protein, fiber, carbohydrate, fat, and calorie information for each category of food items.

Dunkin Donuts Food Nutrient Information
Food Category Average protein content (g) Average fiber content (g) Average carbohydrate content (g) Average fat content (g) Average calories
Donuts 5 1 54 17 389
Hash Browns 5 2 12 6 130
Kolache 20 2 38 21 424
Muffins 4 1 32 8 217
Sandwiches 16 2 36 20 390
Soft Serve 14 2 64 18 465

The data indicates that among Dunkin Donuts food items, kolaches were the highest in protein. It was rather inconclusive which food items were highest in fiber, as all had relatively similar amounts. Nonetheless, hash browns were the clear lowest in carbohydrates, fat, and calories, suggesting that they are the healthiest option overall.

Multivariate Analysis

I also sought to determine the healthiness of Dunkin Donuts food categories using multiple variables simultaneously. Thus, I applied a new metric, the “healthiness metric”, to assess the healthiness of each food category. The average protein content was multiplied by the average fiber content, and the product was then divided by the average carbohydrate and fat contents. The results are in the table.

Dunkin Donuts Food Categories' Healthiness Ratio
Food Category Protein (g) Fiber (g) Fat (g) Carbohydrates (g) Healthiness Ratio
Hash Browns 5 2 6 12 0.1389
Kolache 20 2 21 38 0.0501
Sandwiches 16 2 20 36 0.0444
Soft Serve 14 2 18 64 0.0243
Muffins 4 1 8 32 0.0156
Donuts 5 1 17 54 0.0054

The data indicates that hash browns were the healthiest option, as they had the highest healthiness ratio (0.1389), followed by kolaches (0.0501) and sandwiches (0.0444). This appears to be a more accurate metric of healthiness than comparing individual quantities of nutrients, as the ratio accounts for differences in item portions. The result also agrees with the univariate analysis, in which hash browns were lowest in carbohydrates, fat, and calories.
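The ratios and their ordering in the table above can be recomputed from the rounded category averages. The sketch below is one way to do so; the data frame and column names are chosen for the example.

```r
# Rounded category averages copied from the Dunkin Donuts food tables.
dunkin_categories <- data.frame(
  category = c("Donuts", "Hash Browns", "Kolache", "Muffins", "Sandwiches", "Soft Serve"),
  protein_g = c(5, 5, 20, 4, 16, 14),
  fiber_g = c(1, 2, 2, 1, 2, 2),
  carb_g = c(54, 12, 38, 32, 36, 64),
  fat_g = c(17, 6, 21, 8, 20, 18)
)

# Healthiness ratio: (protein * fiber) / (carbohydrates * fat).
dunkin_categories$ratio <- with(
  dunkin_categories,
  (protein_g * fiber_g) / (carb_g * fat_g)
)

# Rank categories from highest to lowest ratio.
ranked_categories <- dunkin_categories[order(dunkin_categories$ratio, decreasing = TRUE),
                                       c("category", "ratio")]

print(ranked_categories)
```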

Analysis of Individual Donuts

Additionally, Dunkin Donuts is arguably best known for their donuts, so I thought it was fitting to analyze the individual donuts themselves. Many consumers come specifically to eat their donuts, so this information can help consumers select the right donut based on their dietary preferences.

Dunkin Donuts' Donuts Nutritional Information and Healthiness Ratio
Food Item Protein (g) Fiber (g) Fat (g) Carbohydrates (g) Healthiness Ratio
Coffee Roll 7 2 19 48 0.0154
Sugared Donut 4 1 11 24 0.0152
Eclair 6 2 16 50 0.0150
Toasted Coconut Donut 5 3 22 52 0.0131
Apple Spice Donut 4 1 10 31 0.0129
Lemon Donut 4 1 10 31 0.0129
Frozen Chocolate, Large 10 4 18 175 0.0127
Guava Donut (Regional) 4 1 10 32 0.0125
Dulce de Leche Donut (Regional) 5 1 11 37 0.0123
Peanut Donut 8 2 27 50 0.0119

The table above shows the 10 donuts with the highest healthiness ratio at Dunkin Donuts. Consumers can use this information as a guide to select a donut that meets their dietary wants.

Assessment of Accuracy of Metrics

Finally, I wanted to use the Dunkin Donuts food data to evaluate how accurately the healthiness metric assesses the healthiness of a food item. The metric is based on the assumption that the protein and fiber contents (the healthy variables) are inversely proportional to the carbohydrate and fat contents (the unhealthy variables). In order to assess this relationship, the following scatterplots were made comparing the healthy variables to the unhealthy variables. Protein, fat, fiber, and carbohydrates were assessed per calorie in order to account for differences in portion sizes.

In the scatterplots above regarding Dunkin Donuts food, the slope of the line of best fit indicates the direction of the correlation between the two variables. It was found that protein and fat are largely uncorrelated, as their scatterplot had a relatively horizontal line of best fit. Fiber was inversely correlated with fat, while protein was inversely correlated with carbohydrates; both of these findings support the validity of the metric. Lastly, fiber was directly correlated with carbohydrates. In total, these findings indicate that the metric is inherently flawed in categorizing fiber and protein as “healthy” nutrients and carbohydrates and fat as “unhealthy” nutrients. A more nuanced metric is needed to better assess the health of food products.
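The per-calorie comparison can be sketched with cor() and lm() in base R. The rows below are the six category averages from the earlier Dunkin Donuts table, used only as stand-in data; the scatterplots were presumably built from individual items, so the numbers this sketch produces should not be read as the report's results.

```r
# Stand-in data: category averages from the Dunkin Donuts food table.
per_cal_demo <- data.frame(
  protein_g = c(5, 5, 20, 4, 16, 14),
  fat_g = c(17, 6, 21, 8, 20, 18),
  calories = c(389, 130, 424, 217, 390, 465)
)

# Express each nutrient per calorie to account for portion size.
per_cal_demo$protein_per_cal <- per_cal_demo$protein_g / per_cal_demo$calories
per_cal_demo$fat_per_cal <- per_cal_demo$fat_g / per_cal_demo$calories

# Pearson correlation between the two per-calorie measures.
r_protein_fat <- cor(per_cal_demo$protein_per_cal, per_cal_demo$fat_per_cal)

# Slope of the least-squares line of best fit (protein regressed on fat).
fit <- lm(protein_per_cal ~ fat_per_cal, data = per_cal_demo)
slope_protein_fat <- coef(fit)[["fat_per_cal"]]

print(r_protein_fat)
print(slope_protein_fat)
```

The slope and the correlation always share a sign, but only the correlation coefficient measures how strong the linear relationship is, so reporting both gives a fuller picture than the slope alone.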

Comparison of Starbucks and Dunkin Donuts Products

Having discussed the healthiest drink and food products at both Starbucks and Dunkin Donuts, I wanted to directly compare the healthiness of products at each of these retailers.

Comparison of Drinks

I first compared drinks at Dunkin Donuts and Starbucks by their content of protein, sugar, fat and calories. The results are shown in the table and subsequent histograms and frequency polygons below.

Starbucks vs Dunkin Donuts Drinks Nutrition
Retailer Average protein (g) Average sugar (g) Average fat (g) Average calories
Dunkin Donuts 6 43 5 250
Starbucks 7 33 3 194

Overall, the table indicates that Starbucks drinks are higher in protein, while lower in sugar, fat, and calories on average. This suggests that Starbucks has the healthier drink options. Nevertheless, the histograms and frequency polygons clearly have great overlap. One can reasonably assume that the retailer has a minimal impact on the healthiness of a drink.

Comparison of Food Items

Lastly, the nutrition of Starbucks and Dunkin Donuts food items was compared in the following table, histograms, and frequency polygons.

Starbucks vs Dunkin Donuts Food Nutrition
Retailer Average protein (g) Average carbohydrates (g) Average fiber (g) Average fat (g) Average calories
Dunkin Donuts 10 45 1 17 378
Starbucks 11 41 3 16 357

Starbucks food items had more protein and fiber, and fewer carbohydrates, less fat, and fewer calories, on average in comparison to Dunkin Donuts. This reinforces that Starbucks has healthier products overall. Once again, the great overlap of the histograms and frequency polygons indicates that retailer and nutrition metrics are poorly correlated. Other factors aside from retailer have a greater impact on the nutrition of a food item.

Conclusions

Drinks Analysis

Univariate analysis found that coffee had the least unhealthy nutrients, while smoothies had the most healthy nutrients at Starbucks. Similarly, Americanos had the least unhealthy nutrients, while lattes were most nutrient-rich at Dunkin Donuts. Among both retailers, drinks made with skim milk were lowest in fat, while milk type had little effect on the sugar, calories, and protein of each drink. New metrics were developed to assess healthiness of drinks in a more nuanced manner. It was found that coffee at Starbucks and lattes at Dunkin Donuts had the highest ratio of protein to calories. Moreover, at Starbucks, smoothies had the highest proportion of vitamins and minerals. Given the many different definitions of “healthiness” used in this analysis, it was largely inconclusive which drink from either retailer can be deemed the healthiest, and whether these drinks are consistent across retailers.

Food Analysis

Lack of organization and categorization of the Starbucks food data made it difficult to identify healthy food options. Individual food items were identified, but no trends in food options could be deduced for Starbucks products. On the other hand, the organization of the Dunkin Donuts food data permitted univariate and multivariate analyses. Univariate analyses found hash browns to be lowest in carbohydrates, fat, and calories, and kolaches to be highest in protein. A new metric was developed using multiple variables to assess healthiness of food items at Dunkin Donuts. Among all food categories, hash browns had the highest value of this metric, followed by kolaches and sandwiches. Given that the metric assumes protein and fiber are inversely correlated with carbohydrates and fat, the relationships of the two “healthy” variables with the two “unhealthy” variables were assessed. The results indicated that the assumptions of the metric are inherently flawed, and more nuanced metrics are necessary.

Comparison of Starbucks and Dunkin Donuts

Comparing Starbucks and Dunkin Donuts data found that Starbucks had healthier food and drink options on average, based on four nutritional measures for drinks and five for food. Nevertheless, there was great overlap of data between the two retailers, indicating that retailer has little to no effect on the healthiness of food and drink products in this case. This was as expected, as both retailers offer a large variety of products that differ in their nutrition.

Next Steps

The overarching dilemma in this exploratory data analysis was that there are several ways to define how “healthy” a drink or food item is. Comparing individual nutrients, or combinations of nutrients, demonstrated flaws in this analysis. A potential next step is to develop a more standardized definition of healthiness. Opinions of professionals, such as dieticians and nutritionists, can be incorporated into developing this definition. The definition can then be used to develop a metric to assess healthiness of drinks and food items based on their nutritional information. Moreover, more detailed data sets, including information on other nutrients, can be incorporated to build more nuanced metrics for assessing healthiness.

References

Arvidsson, J. (2023, September). Dunkin’ Donuts’ Nutrition: Dunkin’ Donuts’ Menu Nutrition, Micronutrients, and Calorie Information. Kaggle. https://www.kaggle.com/datasets/joebeachcapital/dunkin-donuts-nutrition

Sanchez-Arias, R. (2023, October 19). Sample Datasets: A collection of datasets from multiple sources to be used for demonstrations in data science courses. GitHub. https://github.com/reisanar/datasets

Starbucks. (2017). Nutrition factors for Starbucks: Nutrition information for Starbucks menu items, including food and drinks. Kaggle. https://www.kaggle.com/datasets/starbucks/starbucks-menu/data